ABSTRACT

This paper presents a summary of findings from a 
review of approximately 225 studies of K-12 classroom teachers' 
attitudes toward and other educators' attitudes and support of 
teacher-made tests and testing practices. The findings from the 
review indicate classroom teachers have a positive attitude toward 
teacher-made tests and regard these tests as having a far more 
positive impact upon their day-to-day instruction than do other types 
of tests. Further, teachers' positive regard for these tests is 
reflected in their heavy reliance upon and frequent use of these 
self-constructed tests in their classrooms. In contrast, other 
educators express a positive attitude toward teacher-made tests and 
testing in K-12 classrooms, but this attitude is not reflected in the 
limited extent to which preservice and inservice training and other 
basic resources such as test typing, duplication, and scoring 
services are made available to teachers in meeting their day-to-day 
testing responsibilities. Four tables are included. (Contains 71 
references.) (Author) 




A Summary of Published Research: Classroom Teachers' and 
Educators' Attitudes Toward and Support of Teacher-Made Testing 



Fred L. Pigge and Ronald N. Marso 
College of Education and Allied Professions 
Bowling Green State University 
Bowling Green, Ohio 43403 



A paper presented at the Annual Meeting of the 
Midwestern Educational Research Association 



Chicago 
October 13-16, 1993 




Most K-12 teachers feel that teacher initiated 
assessments have a major impact upon pupil learning and spend 
considerable time in various forms of evaluation each class day. 
Teachers believe that classroom tests guide and instigate pupil 
learning efforts (Rogers, 1969), that the nature of classroom tests 
influences their pupils' study habits (D'Ydewalle, Swerts, & Decorte, 
1983), that testing frequency influences pupil achievement 
(Bangert-Drowns, Kulik, & Kulik, 1988), that carefully administered, 
announced, and monitored classroom tests produce higher pupil 
performance (Hill & Wigfield, 1984), and that prompt return of 
classroom tests accompanied by the provision of knowledge of results 
increases pupil achievement (Kulik & Kulik, 1986). 

Although classroom teachers frequently use and strongly believe 
in the positive benefits of teacher-made tests, and educators believe 
testing and evaluation to be one of the most potent forces influencing 
education, Crooks (1988) contends that the actual elements of the 
evaluation process in the K-12 classroom have received less attention 
from researchers than have many other aspects of education including 
standardized testing. Similarly, Stiggins, Conklin, and Bridgeford 
(1986) described the existing research related to classroom tests and 
testing practices to be limited and narrow in scope. 

The purpose of this paper is to provide a bibliography and 
selected findings from a more extensive review of the research 
literature addressing K-12 classroom teachers' skills and knowledge 
related to the development and use of teacher-made tests. The full 
report of the findings from this review is scheduled to appear as a 
chapter in Teacher Training in Assessment, Steven Wise, editor, in 
volume seven of the Buros Nebraska Symposium in Measurement and 
Testing. The present paper provides information related to just the 
following two of the several questions addressed in the more extensive 
literature review: 1) What attitudes do educators have toward 
teacher-made tests and testing practices as revealed through various 
self report procedures and as revealed through the extent of training, 
support, and resources made available for these activities? 2) What 
attitudes do classroom teachers have toward teacher-made tests and 
testing practices as revealed through various self report procedures, 
their testing practices, and analyses of their self-constructed tests? 

The research studies reviewed for the larger study were 
identified through computer searches of the ERIC data base and through 
the gathering of those reports cited within the computer-identified 
studies. These procedures resulted in the collection of approximately 
225 research reports. 
Question One: 

Educators' Attitudes Revealed Through Self Reports 
and Availability of Training and Resources 
for Classroom Testing 

Testing Standards and Codes 

Until the standards for K-12 classroom teacher competence in the 
assessment of pupils (NCME-ASCTE-AFT-NEA) were published in 1990, the 
testing community had not provided clear expectations of or standards 
for classroom teachers' testing competence. In contrast, the 
statements of standards for standardized testing can be traced back to 
the mid-twentieth century and are currently conveyed in the Standards 
for Educational and Psychological Testing which were jointly developed 
by the American Educational Research Association, the American 
Psychological Association, and the National Council on Measurement in 
Education (AERA-APA-NCME, 1985). More recently these latter standards 
were supplemented by the 1988 Code of Fair Testing Practices in 
Education also jointly sponsored by these three professional 
associations. The Code was designed to complement the earlier 
standards and differs from the standards in audience addressed and 
purpose. It is focused just upon standardized educational testing but 
addresses the practices of both test developers and test users. Its 
stated primary role is to address test and test score misuses, which 
have tended to generate far more public criticism than have questions 
about test quality itself (Diamond & Fremer, 1989). 

Neither the 1988 Code nor the 1985 Standards address 
teacher-devised testing. Frisbie and Friedman (1987) did make an 
effort to show a relationship between the 1985 Standards and 
teacher-devised testing; however, the results of their effort were 
illustrative rather than enumerative in scope. Thus, it appears that 
the measurement community has provided less professional guidance for 
and, as noted previously, less research of teacher-made testing than 
it has for standardized testing. This relative neglect of teacher- 
devised testing has occurred in spite of the fact that the measurement 
profession perceives teacher-made tests and not standardized tests to 
be the dominant influence in K-12 classrooms (Stiggins, 1985). 

Even though the measurement community appears to have provided 
less research support and professional guidance for teacher-devised 
testing in contrast to standardized testing, it appears to have 
considerable doubts about the testing knowledge, skills, and practices 
of educators. For example, Diamond and Fremer (1989) noted that the 
Institute for Research on Teaching, which coordinated the development 
of the previously described fair testing code, was particularly 
critical of the inadequate training of educational personnel relative 
to the interpretation and use of tests. 

Testing Resources 

The perceptions of the extent to which testing expertise and 
other resources are available to support teacher-devised testing 
activities in the K-12 schools appear to be as bleak as the 
measurement community's perceptions of the adequacy of teachers' 
testing competencies. Ruddell (1985), after conducting interviews of 
school principals and school district central office staff relative to 
the availability of testing expertise in K-12 schools, concluded that 
they possessed very limited knowledge about tests and test score 
interpretation concepts such as the standard error of measurement. 

Marso and Pigge (1990) conducted a survey of school district 
designated directors of standardized testing and found that many 
school testing directors, themselves, have limited training in testing 
and evaluation. Further, many of the testing directors when queried 
about support services which they provided for classroom teachers, 
contrary to the expectations stated in the Standards for Educational 
and Psychological Testing, reported that they were not responsible for 
encouraging the use of standardized test results in their schools, for 
training teachers to proctor standardized tests, or for training 
teachers to better interpret scores from standardized tests. 

Marso and Pigge also found that many of the testing directors 
reported increased demands on their time resulting from added 
responsibilities for the management of mandated statewide pupil 
competency testing; thus, undoubtedly, also reducing the testing 
directors' opportunities for providing teachers with testing expertise 
or related testing support services. These researchers concluded that 
it is probably safe to assume that if testing directors do not provide 
basic testing support services for teachers, then these essential 
services probably are not being provided in the schools. This 
conclusion was based partly on the assumption that no one else in 
these schools, especially in smaller size school districts, would 
likely have this responsibility or would likely have the expertise to 
deliver such services. 

Relatedly, Stiggins (1985) noted that few school administrators 
have the training or the experience necessary to help teachers with 
classroom testing or related responsibilities. As further evidence of 
this lack of expertise, Marso and Pigge (1989c) reported negative 
correlations between principals' and supervisors' ratings of teachers' 
various question type writing skills (e.g., ability to write 
multiple-choice and other types of questions) and the observed levels 
of the adequacy of teachers' various question writing skills as 
displayed on their self-constructed tests. As the adequacy of the 
teachers' question writing skills in this study was judged upon the 
frequency of violations of common test construction guidelines, this 
finding may suggest that school administrators, who themselves tend to 
have little or no training in testing, may not have sufficient 
awareness of common test item flaws to be able to identify question 
writing violations in teacher-constructed tests let alone effectively 
advise teachers how to avoid these violations. 

Lambert (1980-81) collected opinions about teachers' attitudes, 
training, and knowledge about teacher-made and standardized tests from 
a national sample of state legislators, state teacher association 
officials, and deans of colleges of education. He found both 
agreement and divergence between and within these three samples. For 
example, approximately one-third of the deans reported that their 
colleges did not offer a measurement course for their teacher 
candidates and that they had no intention of doing so; nevertheless, 
the deans agreed with one another that classroom teachers have a 
negative attitude toward standardized tests, that teachers should know 
more about tests, and that it is very important for teachers to 
construct superior tests for the assessment of their pupils. 
Ultimately, Lambert concluded that all three groups sampled needed to 
know more about the value and limitations of tests. 

In regard to the extent to which resources are made available to 
support teachers' testing activities, Marso and Pigge (1988c) asked 
over 800 K-12 teachers, principals, and supervisors to report the 
extent to which selected resources were available in their schools to 
support classroom teachers' testing responsibilities. They found that 
even basic typing and duplication services were not consistently 
available in 50% of the schools, grade assignment guidelines were not 
available in 50% of the schools, and basic computer services (e.g., 
test scoring, item pools, item analyses, etc.) were not available in 
approximately 75% of the schools. 

Dorr-Bremme (1983), after using questionnaire and interview 
procedures to gather data from a national sample of school staff in 
114 school districts, reported that most K-12 teachers do not receive 
inservice training or assistance of other types in selecting, 
developing, and using tests. Rather significantly, this researcher 
found a relationship between teachers' attitude toward school testing 
and the extent to which school support for testing was made available 
in forms such as expressed principal interest, test interpretation 
assistance, and inservice training related to testing. In school 
districts where these testing support services were more extensive, 
teachers' attitude toward testing was positive; in school districts 
where these resources and services were very limited, teachers' 
attitude toward testing was less positive. In another study related 
to the availability of support for testing, Gullickson (1984) also 
found that teachers reported having little assistance available for 
the preparation, analysis, scoring, or interpretation of teacher-made 
tests. 

Training Resources 

Hermanowicz (1980) argued that a major component in K-12 
teachers' preservice education ought to be training in the development 
and use of classroom tests. Practicing teachers, themselves, report 
that assessment of pupils is a key element in the instructional 
process, and measurement specialists such as Stiggins, Conklin, and 
Bridgeford (1986) and Dorr-Bremme (1983) have provided information 
describing how classroom teachers do integrate testing within their 
day-to-day instructional practices. Further, Schafer and Lissitz 
(1987) reported an increasing awareness of the importance of teachers' 
pupil assessment skills within the educational community as evidenced 
by the positive positions taken by the two major national teacher 
organizations on pupil assessments and by the inclusion of testing as 
one of the five skill components measured by the recently revised 
National Teachers Examination. 

Despite this evidence of the educational community's awareness of 
teacher need for pupil assessment competencies, considerable evidence 
exists which indicates that a significant proportion of professional 
school personnel receive little or no formal training in measurement 
and evaluation. After conducting a survey of 438 institutions of 
higher education, Schafer and Lissitz (1987) found that only 
approximately one-third of various K-12 educational personnel preparation 
programs required a measurement course for certification. Even more 
disconcerting, they found that just approximately 25% of the 
elementary and secondary teacher preparation programs required a 
measurement course. They further noted that, although administrators 
are expected to serve as instructional leaders in schools, the 
administrator education programs were least likely of all preparation 
programs to require measurement training. Among the advanced 
certification programs for educators, they found that only the 
counseling programs are very likely to have a measurement course 
requirement. 

Gullickson and Hopkins (1987) conducted a regional survey of 99 
colleges of education and found that approximately one-half of the 
colleges provided a measurement course for their preservice teachers; 
whereas the other colleges provided just a unit of instruction in 
measurement within another course. In an earlier study Roeder (1973) 
surveyed 850 colleges of education and found that somewhat fewer than 
one-half of their elementary teacher preparation programs required a 
separate tests and measurement course. 

Relatedly, Green and Williams (1989) found that classroom 
teachers with more training in measurement reported scheduling 
teacher-made tests more frequently and using the results of 
standardized tests more extensively than did teachers with less 
training. A rather disturbing finding by these researchers was that 
the less well trained teachers perceived themselves to be more 
knowledgeable about interpreting the results of tests than did the 
better trained teachers. In a similar earlier study, Green and Stager 
(1986-87) found that the extent of teachers' training in testing did 
not influence the frequency of their use of teacher-made tests; 
however, they did find that the more well trained as compared to the 
less well trained teachers were more likely to use appropriate test 
development practices such as item analysis and test specification 
tables. 

Educators typically avoid measurement training when not required 
in their preparation programs (Coffman, 1983; Schafer & Lissitz, 1987; 
Stiggins & Bridgeford, 1982). Some individuals have suggested that 
educators may avoid measurement training because the training being 
provided has not been designed to meet practical classroom needs 
(Airasian & Madaus, 1983; Stiggins & Bridgeford, 1985). In support of 
this speculation, Gullickson (1986a) found discrepancies between 
common college measurement course topics and practicing teachers' 
perceptions of what testing topics and skills are needed to 
function successfully in their classrooms. He reported that classroom 
teachers rely heavily upon informal observations of and direct 
communications with pupils in making instructional decisions, and they 
perceive little need for statistical testing procedures. In contrast, 
Gullickson noted that preservice measurement instruction tends to 
focus upon paper and pencil measurement assessments and statistical 
analyses of data rather than upon informal data gathering procedures. 

The findings from several other studies also indicate that 
discrepancies do exist between K-12 classroom teachers' testing 
practices and typical educational measurement training. For example, 
Gullickson and Ellwein (1985) and Marso and Pigge (1991) found that 
few practicing teachers use statistical analysis procedures in 
interpreting pupil test performance. Kellaghan, Madaus, and Airasian 
(1982) reported that measurement training has resulted in little real 
impact upon teachers' testing practices and suggested that it is 
unlikely to do so until it focuses on the actual demands of pupil 
assessment in K-12 classrooms. Further complicating these concerns 
about measurement training for classroom teachers, Gullickson and 
Hopkins (1987) found that many preservice measurement professors, 
themselves, have limited measurement training and/or experience in the 
use of tests in K-12 classroom settings. 

In addition to the major concerns about K-12 teachers having 
little or no preservice training in testing and whether such training 
is appropriate, several researchers have reported that inservice 
teacher training in testing is almost nonexistent (Dorr-Bremme, 1983; 
Gullickson, 1984), and Marso and Pigge (1991) found indirect evidence 
of this in that neither teachers' ratings of their own testing 
proficiencies nor the observed quality of their teacher-made tests 
differed when these ratings and tests were grouped by the teachers' 
years of teaching experience. Further, teachers frequently perceive 
their inservice training to be not very helpful. For example, Marso 
and Pigge (1987b) found that of all school experience factors 
assessed, first-year teachers were most disappointed with their 
inservice training. Relatedly, Stiggins (1988) suggests that teachers 
will seek inservice training designed to improve their tests and 
testing practices but will avoid inservice measurement training if it 
is perceived to be like that provided in preservice training. 

In conclusion and as summarized in Table 1, it is apparent that 
K-12 teachers are perceived by the educational and measurement 
communities to have limited testing knowledge and skills; that neither 
measurement consultative expertise nor inservice training in testing 
is generally available to teachers in most schools; that even basic 
testing support services such as typing and duplication assistance are 
not commonly available to teachers in a large number of schools; that 
a large portion of classroom teachers have had little or no formal 
preservice or inservice measurement training; and that much of the 
training in pupil assessment which is available to teachers and 
teacher candidates is perceived by practicing teachers to be 
inappropriate relative to actual classroom instructional needs. 
Insert Table 1 about here 



Question Two: 

Teachers' Attitudes Revealed Through Their Self Reports, 
Testing Practices, and Self-Constructed Tests 

Testing Beliefs and Practices 

Mehrens and Lehmann (1987) have estimated that a typical pupil 
will take between 400 and 1000 teacher-made tests during their K-12 
school years. Crooks (1988) and Haertel (1986) have indicated that 
approximately 5 to 15 percent of a typical classroom day is devoted to 
pupil assessment; Newman and Stallings (1982) and Stiggins (1988) have 
estimated that teachers spend approximately 11 to 20 percent of a 
typical work day in some aspect of pupil assessment, and Marso and 
Pigge (1991) reported that K-12 teachers construct an average of 54.6 
formal paper and pencil tests in a typical school year. 

Teachers use both self-constructed tests and publisher- 
constructed (textbook or workbook) tests but prefer their own tests. 
Dorr-Bremme (1983), studying a national sample of teachers, reported 
that 95 percent of the teachers used self-constructed tests and 77 
percent used publisher-constructed tests. But regardless of test 
source, teachers and pupils spend considerable classroom time and 
effort in testing activities (Fleming & Chambers, 1983). 

Teachers' opinions about what are appropriate testing practices 
appear to vary somewhat by grade level of instruction and by subject 
area content being assessed. At the upper grade levels, teachers rely 
more on teacher-constructed as compared to publisher-constructed 
tests, express more concerns about the quality of pupil assessments, 
and use somewhat more test quality control procedures such as item 
analysis and checks on reliability than do teachers in the lower 
grades (Marso & Pigge, 1991; Stiggins & Bridgeford, 1985). Primary 
grade teachers place more focus on pupil work samples than upon 
testing; lower elementary grade teachers more frequently use 
worksheets and tests provided in publisher textbooks and workbooks 
than do teachers at other grade levels; and upper grade and high 
school teachers predominantly use formal self-constructed tests in 
their assessment of pupils (Herman & Dorr-Bremme, 1982; Marso & Pigge, 
1991; Salmon-Cox, 1981). 

Essay questions and tests appear not to be held in high regard by 
teachers and are very seldom used at any grade level. Essay questions 
are more frequently found in English, history, and social studies 
tests than in other subject area tests, and they are more frequently 
used in the upper grades than in the lower grades. Math and science 
teachers test their pupils more frequently but are less likely to use 
essay items than other subject area teachers. Math and science 
teachers are more likely to use formal paper and pencil tests than 
informal assessments. Teachers in writing and speech classes are more 
likely than are other teachers to use direct observations and informal 
judgments in assessing the progress of their pupils (Marso & Pigge, 
1991; Stiggins & Bridgeford, 1985). 

Upper grade level teachers believe that letter grades or marks 
should be based primarily on pupil test performance and daily work; 
whereas K-4 grade teachers believe that daily work and observations 
are more important than tests in assigning grades. Most teachers when 
assigning marks consider teacher-devised tests to be a primary source 
of information (Marso, 1986; Shulman, 1980). 

Teachers generally favor self-constructed items as compared to 
items from other sources, and they typically report constructing from 
50 to 75 percent of the test questions used on their tests. Teachers 
also favor the use of a variety of test items with an average of 2.6 
question types found on a typical teacher-devised test (Dorr-Bremme, 
1983; Marso & Pigge, 1991; Yeh, 1981). 

A combination of completion or short-response type questions 
followed by matching, multiple-choice, true-false, and essay type 
questions are most frequently used in teacher-devised tests. When 
teachers are asked to rate the usefulness, adaptability, and fairness 
to pupils of the various question types, the question types are placed 
in a somewhat different order: matching, completion, short-response, 
multiple-choice, true-false and essay. Teachers do believe that 
pupils study more for essay tests as compared to objective tests and 
that essay tests are more likely to function at higher cognitive 
levels than are objective tests even though they deem the essay items 
to be less useful and seldom use them (Coffman, 1971; Marso, 1985). 

Most classroom teachers provide information to pupils regarding 
their performance following the administration of a classroom test, 
and typically they report spending about one-half of a class period 
for that purpose. Teachers report that pupils usually are very 
attentive and motivated during these test feedback sessions (Haertel, 
1986). Teachers tend to reuse their tests without analysis and 
revision and seldom use statistical procedures to assess the quality 
of their tests (Gullickson & Ellwein, 1985; Marso & Pigge, 1988b). 

Few studies describe exactly how teachers use tests in their 
classroom instruction (Kuhs et al., 1985). Teachers, themselves, 
report a heavy reliance on teacher-made tests in their day-to-day 
instruction. In contrast they report placing little reliance on 
standardized tests for making instructional decisions (Salmon-Cox, 
1981). Relatedly, Borg, Worthen and Valcarce (1986) reported 
unfavorable and indifferent teacher attitudes toward the classroom use 
of standardized tests but highly positive teacher attitudes toward the 
use of teacher-made tests. Stiggins and Bridgeford (1985) reported 
that classroom teachers did use their self-constructed tests for pupil 
diagnosis, grouping, grading, evaluation, and the reporting of pupil 
progress in their classrooms and that they placed more emphasis upon 
structured performance assessments than upon spontaneous observations 
of pupils in making instructional decisions. 
A common criticism of teachers is that they tend to overvalue 
test scores, and in particular standardized test scores, relative to 
other available information about pupils. Hall, Carroll, and Comer 
(1988) found, however, that classroom teachers consistently favored 
the results of their self-constructed tests over the results of 
standardized or state competency tests in making decisions. They also 
noted that teachers made decisions with a reasonable regard for the 
complex data requirements present in classroom settings. Similarly, 
Lazar-Morrison, Polin, Moy, and Burry (1980) concluded that teachers 
place greater confidence in the results of their own judgments of 
pupil performance than upon any formal tests; and Stiggins and 
Bridgeford (1985) reported that teachers rely on a number of sources 
of information in making decisions about pupils and that teachers' 
relative reliance on sources of pupil information is in the following 
order: teacher-made tests, standardized tests, structured performance 
assessments, and spontaneous observations. 

Dorr-Bremme (1983) concluded that teachers bring several types of 
assessments to their decisions about pupils and that they rely more on 
personal experiences and observations than upon test scores. 
Similarly, Salmon-Cox (1981) reported that high school teachers made 
very little use of standardized test scores in evaluating pupils; 
Shavelson, Cadwell and Izu (1977) found that teachers gave due 
consideration to the reliability of data in making decisions about 
pupils; and Kellaghan, Madaus, and Airasian (1982) found that teachers 
can accurately predict pupil test performance and only use students' 
standardized test scores to corroborate their own judgments. 

The findings of the research related to teachers' use of test 
scores suggest that classroom teachers use scores to raise but not to 
lower their expectations of individual pupils. When teachers note a 
discrepancy between their perceptions of a pupil's ability and test 
scores, teachers ignore test scores when the scores suggest that less 
might be expected of a pupil, and teachers raise their expectations of 
a pupil when test scores suggest that more might be expected of a 
pupil (Airasian, Kellaghan, Madaus, & Pedulla, 1977). 

Two studies of teachers' attitudes toward educational testing 
appear to be representative of teacher perceptions of tests and 
testing. Green and Stager (1986-87) surveyed 555 classroom teachers 
and reported that younger teachers are more skeptical of testing than 
older teachers, that upper grade teachers are more positive toward 
testing than lower grade teachers who place more emphasis on classroom 
observations and informal pupil assessments, that teachers are 
positive toward teacher-made tests but tend to be negative or 
indifferent about standardized tests, that most teachers express 
interest in upgrading their testing skills, and that reported use of 
contemporary measurement practices (e.g., use of test specification 
tables and item analysis, etc.) was found to be somewhat related to 
more frequent pupil testing practices but not to attitude toward 
testing . 

In a second study of teachers' attitudes and beliefs about tests, 
Gullickson (1984) reported that teachers felt that teacher-constructed 
tests result in increased pupil effort, influence pupil self-concept, 
create desirable competition among students, improve interaction among 
pupils, improve the classroom learning environment, better focus 
teaching, provide a good learning experience for pupils, motivate 
pupil study, and accurately reveal pupil progress. Further, 
Gullickson found that teachers believe that frequent brief tests are 
more desirable than infrequent lengthy tests, school administrators 
encourage frequent testing of pupils, pupils prefer frequent tests, 
pupils try hard on tests, tests are an important instructional tool, 
tests need to be tied closely to instruction, tests help evaluate 
instruction, essay tests better assess pupil progress than objective 
items and measure at higher cognitive levels, tests should not be 
the sole determinant of grades, and tests are necessary to help 
justify grades to parents. 

Pupils appear to reflect the attitudes of their teachers about 
tests, for students also feel that tests help them learn, and they too 
favor frequent testing. Pupils report that teacher-made tests must be 
taken more seriously and are more difficult than standardized tests 
(Kulik & Kulik, 1981), and, like many teachers, some pupils feel that 
standardized tests are a waste of time (Stetz & Beck, 1981). 

In summation, this review of teachers' testing practices and 
beliefs suggests that K-12 classroom teachers have a very 
favorable attitude toward teacher-made tests and testing: they feel 
it is appropriate to expend, and do expend, considerable effort and 
time in fulfilling teacher-instigated testing responsibilities in their 
classrooms; they favor and schedule frequent tests, followed by class 
discussions of pupil performance; they have concerns about, but also 
positive feelings toward, the role of testing and pupil evaluation in 
the instructional process; and they have confidence in their classroom 
tests and their overall testing ability but recognize that they would 
benefit from practical training in testing. A summary of teachers' 
testing practices, beliefs, and attitudes is presented in Table 2. 



Insert Table 2 about here 



Assessments of Teacher-Made Tests 

Few studies of teachers' testing knowledge and skills have been 
conducted wherein direct analyses of samples of their teacher-made 
tests have served as the major data gathering procedure. One such 
study was reported by Fleming and Chambers (1983). They analyzed 342 
teacher-made tests encompassing 8,800 test questions constructed by 
teachers assigned to several grade levels and subject areas in the 
Cleveland Public Schools. Some of the more salient findings from this 
study follow: 

1. Short-answer (including fill-in-the-blank) questions were most 
frequently used, followed by matching, multiple-choice, true-false 
(seldom used), and essay questions. Essay items were very 
infrequently found on any of these teachers' tests (about 1% of 
all questions). 




2. Almost 80 percent of the test questions were judged to function at 
the knowledge level. [illegible] Few questions on any tests were 
judged to measure pupils' ability to make applications. 

3. Fewer than two-thirds of the tests contained directions for all 
question types. [illegible] they often were not numbered 
[illegible] at all. 

[illegible] of the tests were handwritten, were poorly 
reproduced [illegible] with content. 

In a second study, [illegible] (1991) analyzed 6504 test 
questions contained within 455 exercises (a group of 
questions of similar type on a test) on [illegible] formal teacher-made 
tests constructed by classroom teachers with one to 10 years of 
teaching experience [illegible] 
measurement course. Some of the salient findings from this study 
follow: 

Question type use varied by grade level and subject area content, 
[illegible] an "average" test made up of [illegible] short-response, 
matching, true-false, multiple-choice, problems, completion, 
interpretive exercises, and essay. 

. ^ in test construction practices or 
very few differences were noted in test co ^^^.^^^ ^^^^^ 

years of teaching experience. 

type. 

Teachers reported P'^^?"^"^ an average of ^54.6 forma^^ 
teacher-made tests each ' /^f^^^^^, „LkB or more frequently 

teachers scheduled a ^ "^^^.^^^'/..^t of the teachers reported 

i;itirsi!rr;rorTor^^ — - - 



follow 



3. 



4. 



testa 
5. As a total group of [illegible], 72 percent were 
judged to function at the knowledge cognitive level, but 
the large majority of the questions functioning beyond the 
knowledge level were contained in just [illegible] 
tests. 



In a study of secondary math teachers' [illegible] 
tests, Oescher and Kirby (1990) [illegible] over 1400 
test questions and gathered the responses of 35 teachers to a teacher 
testing practices questionnaire. They concluded that 70 percent of 
[illegible] of the tests were [illegible] cognitive functioning level 
[illegible] teacher-made questions. [illegible] teacher training. 
Black ([year illegible]) [illegible] constructed by 74 [illegible] 
their study [illegible] 100 percent at the knowledge level. 

Similarly, Stiggins, Griswold and Wikelund (1989) conducted 
interviews, class observations and [illegible] 
constructed tests of 36 K-12 classroom teachers [illegible] 
participating in [illegible] 
endorsed efforts to teach with a focus on the development of their 
pupils' thinking skills. They found that [illegible] 
self-constructed tests were [illegible] functioning 100 
percent at the knowledge level [illegible] math tests. These 
researchers commented that it was easier to train teachers to teach 
[illegible] tests to measure pupil achievement 
[illegible] 



"'^"'^ of the direct analyses of 

Xn sun^ation, the -view of studies o^t^e ^^^^^^,,,,3 , t 

rerpitrtefch'rs. -vora.le attitude t e^teach^ ^^^^^^^ 

TJ.S commonly -^^^^-.f;-,^ ^e r/esources, appear ^ -;%alue 

Teacher-made tests 
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through analyses of their 

Insert Table 3 about here 

ci^^ilc. and TrainiSS 

K Belief3_aboutjresting_^k^ 

■Ve^£i^^rB_3sUSt informal in 

a. .»,ors report that they place J"^ classroom decisions 

'^rro formal assessments in making c -s ^^^^^^ .^^.^^^^ 

contrast to rormdJ. logO: Salmon-Cox, 19»i). 

1586.1. M.ch." „„f„l l„ ».klng O"""-^'' mor. 

(Dorr-Bremme, 1983). (ic,87a) 

J K-12 classroom teacuei. instruction than tneij. 

found that K 12 associated with instr 

measurement skills c structurally sound ^est qu ^^ported 

di.g„o.lng P»Pil '"'"ft^" of unit. o£ i„.tr.ctlo„. 

TKe d.t. p««nt.d In / «%i',f.f/ff «.tln, =o.p.t,ncl.. 

lo^- the need for competency in 

" ..V, teacher manuals. 



low the neea — ; ^^^.^Is 
sources such as teacne 

Insert Table 4 about here 



in the research findings 



Although there is some ^-"^f^^^Jtesting ability, teachers 
..out^ achL. Pe.cep^^ training in testing 

typically: ^ ^ 
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Table 1 ^. pef lected_in 

^^^^^ TTTlas^roomjreatina 

to_SuEEort_Claaar Poachers' testing 

. 1990 have standards for claBsroom t-ch^^^^^^^. 
1 JuBt sxnce 1^^^ whereas standards iuj. 

con,pet.nce ^^en available, w^-^^^^^^^ ,,3 century. 

testing have existed Bince ^^^^^^^ 

..e e-ational an. — ^J^^^ 

rtrrjir -pup- " 

.,3„any teachers, as well as 
3, .ea.ure.ent -..unity perce.ve^. ,,,33roo. 

others in ^'i""^^""'^^" n le 
testing knowledge and skUls. 

. i 1 able in most n i-^ 

responsibilities. Most training in testing ana 

"eparation programs do not require ^^^^^^^ ^"'^''"tav 
Measurement. Further .any -^/^f^.P^.^entB , the.salves, may 
Teacher candidates in t-^/ J.^s in classroo. test.vng. 
Have United training o. e.p ^^^^^^^^ 

...t K-1. -c-i.nal aa^in^s^^^^^^^^^^^^ - ^^^^^ management 
S rerinnfo^ams in the pu.lio schools. ^^^^^^ 

or no roriuet-^ 

• 4-oanher candidates, 
of their teacher ^^^^^ 

r.r^y to en::ourage teacneL ^4 fi ration tables, or 

^1 t-estinq efforts, 
their pupil testing ^ directors of 

. .s many as 20 percent ^^J^rf fotral -^"-^ ^ re'd^r a 
K-12 school districts h^ve^n ^^^^^^^^ expected 

measurements than 
beginning clasaroom teacner. 
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..... 1 .continued, ,„,Uc.tion 

.v,n -.f. "tva.-^^^ 

4 :rr. r;.r,! rr".rr=f.; ..^ ....... 

aervices aucn a gchoola 
relatively even fewer 
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relatively even i t d to classroom 

testing unless xt i-s ^ ^ • ' ng 

Buch as knowledge of sub^e 



13. 
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,„,„t and .auction o"™""''"! "^.^-duvis.d testing 
"rriv l.erre....oh on =',-""-;,rnr.n^ " 
rorp.»d'to^e...tcn of .t.nd..d-.d 

„p.ct. Of .duction. .„u..iiity o. 

,,.,„d te..«c. •"-roroorr.«rn,Tcti.ui» po.itiv.iv 

irre"erfr.r Ittitnd. to»atd t..tin,. 
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V, =. Attitudes a^.Be£lScMd_in-2]l£i£-S 
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— " +-pflts qeneraliy nave 

positive impact ut< 

pvipil-s- ^ „eet their 

„rf use a^Bessment procedures that 

pupils should closeiy 

^ , thcv must ti.^ uiic.-- 

^h«t for tests to be test results 

must OB j-uuii^ 1 „4-*-a-i-ors and 

5. T..=h". believe, "^^"^f "„,;he,-..d. te.te 'f"" 't„, 

assessments deriveu 

textbooks. greater extent than 

, .e.=.e„ ..1, "^-rrsstcrr-r..: r«e'te„c, ...» 

school districi: p f ^ pupil grade 

.eueve t.,^ --/rt^r./.ee.ent ....oe, 

level variation9 requxr 

„a p«=tlce.. ^^^^^^ ^^^^^ „ 
Table 2 (continued) 
13. 



14. 

15. 
16. 
17. 



2 (continued) teachers in 

justifying gradeo to pup ^^^^^^^ 

clas.roo. related ,,,erpreted within the 

Helieve that test scores "luBt be ^ ^^^^^ 

behavxors less u reliance on 

l„£om.Uy gathered in ^^^^ ^^^^^^ 

_ A. « rt+- a _ 
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do objective teste. completion, and 

o i-hat matching, short-resp / efficient, and 

-r«rrr.re.. » - 

useful types of quests 

.eachere -^eve that a;-etv;^^^^^^^ ^,,,,3 and to better 
in ^X':,"instructional objectives. 

asse.B a variety ^.^ q^e^tions 

, ■ o that teacher-made teats should ^^^^ 
Teachers believe that te thinking bX^Ib, and 

that aemand higher-orde.^puP^_^^^ ^^^^^ ,33ts. 
over estimate the cog fairly and 

;rrX=%=N5:.^~— ^^^^ "'- 



^""' nt pupil cheating. 

to prevent pup^ interpreted and 



analyses. 



(table continues) 






Table 2 (continued) ^^^^^ .^.easment procedure., 

.eaCers --r^.l^faerrnnrful cla~ 
to be cons latently demands of teacn ^^ies. 

^. •^r.t- in time and enei:^y^ ; nstruct 3Lonal acr^-v 

efficient m classroom insti-u 

3upportive of on-go.n. 

..nd considerable in and typically 
25. Teachers eKpend -n -'^^'-"'^TV JZ o. more often in mo.t 

in testing ^^^^^^ ^^ce every two weeks ^^^.^ ^nd 

schedule formal teats ^^^^^^ ^^^^^i test 

courses, ov,n test gueet.onB. 

construct most of the. and grading activities 

. 3 relieve that testing, e^Xa ^leasan't classroom 
ire aiong their more demanding and 

responsibilities. ^^^^.^,g and 

, ,.,ers commonly express concern about 

rairtion responsibilit.ee. ^^^^^ ^^^^ 

time editing or 
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^^^^ =^ flhort-answer, cou^ ^^^^ 

latching cjueat.on type „,,tiple-cho.ce j.^^^^ ^^^.^ 
cognitive demand are used less 

frequently used, true ^^ently- 
iSlons are used very 



as 



3. 



o = valuing of i-^^"*^ predominantly 
Teachers express a va ^ ,re p 

cognitive functioning level- 

— cognitive -^-rt°est'^ 
....hers appear -J^Snrctl^or s.ill as.ec.s - 

rorsS-o;rnd^e r/-^^^^^^^^ 

;roficiency in -^^^^ .^st ^P-f -^^^ly^i^ procedures, 
-«^^"^^^L'te; writing, and statistical 

analysis, ite .^^^ 
^t'^-^- of are unable to ident- y 

.eachers appear - -7^^ ^-tly ^--rer^'^ests 
■ Litin9 ^^--.fdeUnes, for -^^VB- °^ ^^^^ .o.^on test 

question and test training in test^ g 

4-^ =;uff ici-e"^^^ vaiuti ^^raining. 3-* 

, teachers appear "ot .either ^---/.^^rtence appear t< 

•™r,r-ove classroom t;*! 
improve cVLlls. 
Table 4 



4.44 ^ 

4.35 i 

4.29 3 

4.25 4 



g«Ea.h..£»I^^ ^. ^^^^ 

Competencies or Skills 

1. Grading tests, papers, projects, homework, etc. 
2. Making tests reflect what is covered in text and class 
3. Calculating end of term grades from term work 
4. Identifying individual and class strengths and weaknesses 
5. Deciding importance of tests, papers, etc. in grading 
6. Determining what needs to be retaught after test 
7. Constructing tests that represent true student progress 
8. Deriving information from tests to guide students 
9. Identifying good and poor questions for future tests 
10. Use of observations (visual) to assess and guide learning 
11. Writing questions in harmony with school and class goals 
12. Interpreting test scores and student progress 
13. Use of tests and grades to positively influence learning 
14. Setting up readable, scorable, and attractive tests 
15. Stating objectives sufficiently clear to suggest test items 
16. Writing test questions that demand higher thinking processes 
17. Selecting good test questions from teacher manuals 
18. Writing good matching questions 
19. Writing good completion questions 
20. Writing good multiple-choice questions 
21. Writing good true-false questions 
22. Use of less formal assessments: checklists, ratings, etc. 
25. Use of sociometric, guess who, and related techniques 
23. Scoring essay questions 
24. Writing good essay questions 
26. Calculation of means, standard deviations, reliability, etc. 

*Means were derived from a 5-point Likert scale where 5 = high. 